Quantum Exploration Algorithms for Multi-Armed Bandits
Abstract
Identifying the best arm of a multi-armed bandit is a central problem in optimization. We study a quantum computational version of this problem with coherent oracle access to states encoding the reward probabilities of each arm as amplitudes. Specifically, we show that we can find the best arm with fixed confidence using $\tilde{O}\bigl(\sqrt{\sum_{i=2}^n \Delta_i^{-2}}\bigr)$ queries, where $\Delta_i$ represents the difference between the mean reward of the best arm and that of the $i^\text{th}$-best arm. This algorithm, based on variable-time amplitude amplification and estimation, gives a quadratic speedup compared to the best possible classical result. We also prove a matching lower bound (up to poly-logarithmic factors).
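To make the scaling in the bound concrete, the following minimal sketch computes the gap-dependent quantity $\sum_{i=2}^n \Delta_i^{-2}$ and its square root for a set of arm means. It is only an illustration of the arithmetic: the function name `gap_complexities` and the example means are hypothetical, and constants and poly-logarithmic factors hidden by the $\tilde{O}$ notation are ignored, so the numbers are scales rather than actual query counts.

```python
import math


def gap_complexities(means):
    """Return (sum of inverse squared gaps, its square root) for arm means.

    Hypothetical helper for illustration only. The first value is the scale
    of the classical sample complexity, the second is the scale of the
    quantum query bound stated in the abstract (log factors ignored).
    """
    if len(means) < 2:
        raise ValueError("need at least two arms")
    ordered = sorted(means, reverse=True)
    best = ordered[0]
    # Delta_i for i = 2..n: gap between the best mean and the i-th best mean.
    gaps = [best - p for p in ordered[1:]]
    if any(g <= 0 for g in gaps):
        raise ValueError("the best arm must be unique")
    h = sum(g ** -2 for g in gaps)
    return h, math.sqrt(h)


# Assumed example: four arms with reward probabilities 0.9, 0.8, 0.7, 0.5.
classical_scale, quantum_scale = gap_complexities([0.9, 0.8, 0.7, 0.5])
print(f"sum of inverse squared gaps: {classical_scale:.2f}")
print(f"square root (quantum scale): {quantum_scale:.2f}")
```

For these assumed means the gaps are 0.1, 0.2, and 0.4, so the sum is about 131 while its square root is about 11.5, which is the quadratic separation the abstract describes.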
Related resources
Algorithms for Differentially Private Multi-Armed Bandits
We present differentially private algorithms for the stochastic Multi-Armed Bandit (MAB) problem. This is a problem for applications such as adaptive clinical trials, experiment design, and user-targeted advertising where private information is connected to individual rewards. Our major contribution is to show that there exist (ε, δ) differentially private variants of Upper Confidence Bound alg...
Distributed Exploration in Multi-Armed Bandits
We study exploration in Multi-Armed Bandits in a setting where k players collaborate in order to identify an ε-optimal arm. Our motivation comes from recent employment of bandit algorithms in computationally intensive, large-scale applications. Our results demonstrate a non-trivial tradeoff between the number of arm pulls required by each of the players, and the amount of communication between ...
Pure Exploration in Multi-armed Bandits Problems
We consider the framework of stochastic multi-armed bandit problems and study the possibilities and limitations of strategies that explore the arms sequentially. The strategies are assessed in terms of their simple regrets, a regret notion that captures the fact that exploration is only constrained by the number of available rounds (not necessarily known in advance), in contrast to the case whe...
Combinatorial Pure Exploration of Multi-Armed Bandits
We study the combinatorial pure exploration (CPE) problem in the stochastic multi-armed bandit setting, where a learner explores a set of arms with the objective of identifying the optimal member of a decision class, which is a collection of subsets of arms with certain combinatorial structures such as size-K subsets, matchings, spanning trees or paths, etc. The CPE problem represents a rich cl...
Almost Optimal Exploration in Multi-Armed Bandits
We study the problem of exploration in stochastic Multi-Armed Bandits. Even in the simplest setting of identifying the best arm, there remains a logarithmic multiplicative gap between the known lower and upper bounds for the number of arm pulls required for the task. This extra logarithmic factor is quite meaningful in today's large-scale applications. We present two novel, parameter-free algor...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2021
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v35i11.17212